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I. INTRODUCTION 

Self-assembly at the molecular scale is a ubiquitous 
process in the natural world and predicted to play a sig- 
nificant role in advancing nanotechnology. The formation 
of the protein shells, known as capsids, that provide the 
packaging for spherical (or polyhedral) viruses is a par- 
ticularly familiar instance of natural self-assembly; the 
rational design of antiviral agents would benefit from an 
improved understanding of how such shells assemble, a 
phenomenon at the border between biology and chem- 
istry. Direct visualization of evolving molecular assem- 
blies is inherently difficult, since intermediate states elude 
experimental capture, and final states reveal little about 
their assembly pathways; thus the simulation of suitably 
designed models ought to prove helpful in exploring the 
processes and pathways involved. 

Virus capsids have highly symmetric shapes, reflecting 
the fact that they are assembled from multiple copies of 
either a single molecular component - the capsomer - or 
a small number of distinct capsomcrs 0. Capsid assem- 
bly, a process whose details are little understood @], is 
governed by different classes of interactions: There are in- 
teractions between capsomer proteins and nucleic acids 
(in the form of DNA or RNA) that initiate and regu- 
late the assembly process, and subsequently stabilize the 
packaged genetic material inside the capsid. More signif- 
icantly from the perspective of the present study, there 
are also interactions between the capsomer proteins that 
stabilize the shell structure itself. What makes capsid 
self-assembly an ideal candidate for simulation, despite 
this apparent biochemical complexity, is the fact that it 
is able to occur reversibly in vitro Q (with scaffold pro- 
teins participating in the growth but not affecting sta- 
bility), without the genetic material that is essential to 
the virus in vivo (in other cases the nucleic acid does 
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play a stabilizing role Q). In addition, structurally in- 
tact empty shells occur in vitro after removal of their 
contents, and viruses themselves form empty capsids |5j; 
background information of this kind simplifies the model 
design considerably, since the system need only consist 
of a very small number of well-characterized component 
types. The goal of the present work is to use molecu- 
lar dynamics (MD) simulation in modeling the capsid 
assembly process, based on suitably simplified represen- 
tations of the essential molecular components. 

Motivation for simplified descriptions, that avoid be- 
coming embroiled in the detailed physical and chem- 
ical characteristics of the capsomers, stems from the 
prominence of icosahedral symmetry among spherical 
viruses, irrespective of their biological origins. Nature 
has adopted this structural motif (the other basic design 
is a helical tube) precisely because the high degree of 
symmetry leads to a minimal set of construction rules 
Q • Since the task of the genetic information carried by 
the viral nucleic acid is not only to instruct the virus how 
to infect the host, but also to specify how it must repli- 
cate itself, if less information can be devoted the latter 
mission, more will be available for the pernicious primary 
task. At the same time, shapes based on icosahedra come 
close to maximizing the volume to surface ratio, another 
advantageous feature. Beyond their basic packaging role, 
capsomers also have an important function in the virus 
life cycle || that the simple structural models do not 
attempt to address. 

Self-assembly is related to crystallization, in the sense 
that both are governed by the laws of thermodynam- 
ics, with the obvious difference that while crystals can, 
in principle, grow without limit, viral growth is self- 
limiting. The processes are driven by bond formation 
between assembling units, whether atoms or protein com- 
plexes, with the goal of reaching the relevant minimal 
free-energy state. Assembly of symmetric structures oc- 
curs both in biological and nonbiological contexts; while 
crystal growth may not be of particular biological rele- 
vance, the formation of icosahedral fullerenc molecules is 
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an indication that some geometries are common to both. 
Analogous icosahedral motifs are to be found in geodesic 
domes; these owe their detailed structure both to the 
same minimalist construction rules, as well as to con- 
siderations of optimal rigidity. Nature embodies a great 
deal of what is considered successful engineering design, 
and indeed, many noteworthy engineering achievements 
borrow from the geometric forms found in nature, even 
though size scales and materials can differ greatly. 

In order to avoid a substantial, if not overwhelm- 
ing, degree of incorrect assembly, it seems plausible that 
the construction of viral capsids demands a generalized 
scheme independent of many of the molecular details of 
individual viruses; nonspecific assembly pathways might 
well exist, a knowledge of which could be important in 
finding ways to modify or inhibit the construction pro- 
cess, with obvious therapeutic and technological bene- 
fits. Consequently, initial exploration should focus on 
the basic 'shape' of the constituent capsomers, clearly an 
important factor in determining the assembly outcome. 
Such low-resolution descriptions amount to little more 
than caricatures, but, if general organizational principles 
do exist, they could well be sufficiently robust to reveal 
themselves in models from which extraneous detail has 
been removed. There is a longstanding tradition of work- 
ing with reduced models that has proved extremely suc- 
cessful with other inherently complex phenomena, such 
as micelle growth and protein folding, 

The use of MD for modeling capsid self-assembly was 
introduced in a previous study [9j involving polyhedra 
built from 60 triangular units. In the present paper 
the work is extended in two directions, first by showing 
how more relevant trapezoidal capsomer shapes can be 
used and larger shells constructed, and second by demon- 
strating alternative ways of designing capsomer models. 
Despite the simplifications, such simulations provide ac- 
cess to assembly pathways, and are able to predict time- 
dependent partial structure populations that can, in prin- 
ciple, be compared with experiment [ljj. The following 
sections describe alternative model designs ( Section ITU, 
the nature of the bond interactions fSection llII|) , different 
assembly scenarios (Section Hvj) including the pathways 
and supplementary interaction rules involved, computa- 
tional techniques (Section [VJ, and results (Section IV1|) 
from extensive simulations. It should be stressed that 
while the focus is primarily on the formation of capsid 
shells, the method is not in any way restricted to virus 
structures, and is applicable to supramolecular assembly 
in general; a brief demonstration (Section IVIIfl of this 
capability precedes the concluding section. 



II. CAPSOMER DESIGN 

A. General considerations 

Individual capsomers are large protein complexes; ow- 
ing to their size and complexity, such structures must 



be represented in a highly reduced form, while retain- 
ing sufficient detail to ensure meaningful behavior when 
studied by MD simulation. Viewed from this perspective, 
capsomers have two principal but not entirely indepen- 
dent characteristics. One is the effective molecular shape; 
this must be tailored to ensure capsomers fit together to 
form closed polyhedral shells. The other concerns the in- 
teractions between adjacent capsomer regions in the final 
structure; these are responsible for driving self-assembly 
and maintaining the structural integrity of the finished 
shell, and must be defined accordingly. 

There are few guidelines to aid in the model design. 
While general thermodynamic considerations can help 
choose ranges for the force parameters, the approach it- 
self is entirely empirical. Progress in this kind of discrete- 
particle modeling is tied to the advance in available com- 
puter power, with increasing computational capability al- 
lowing the incorporation of additional features that aid 
the assembly process. Other, entirely different techniques 
have also been employed. One |ll| considered the dy- 
namics of spherically symmetric particles subject to di- 
rectional interactions, whose states and binding energies 
are selected probabilistically based on rules involving lo- 
cal neighborhoods Another (T^j studied disk pack- 
ings on a spherical surface using a mean-field statistical 
mechanical analysis dependent on curvature and cover- 
age. 

No solvent is included in the simulation, an approx- 
imation borrowed from protein folding; even a neutral 
solvent would increase the computational effort substan- 
tially without influencing the outcome, not only because 
of the additional particles involved, but also because of 
the slower capsomer dynamics when moving in a solvent 
rather than in a vacuum. (In future detailed work, ac- 
counting for the effect of, e.g., pH and salt concentration 
on the pathways and the final state, will require more 
substantial models that address interfacial interactions 
and solvation effects at a molecular level. An alterna- 
tive, Langevin-like representation of the solvent contri- 
bution as small random impulses, suitably distributed 
over space and time, has also not been used; if the dy- 
namics are dominated by intermolecular forces, as is the 
case here, there will be little overall effect on the eventual 
outcome.) 

The need to assure defect-free assembly allows a de- 
sign tradeoff between model capsomer size and interac- 
tion complexity. Bonding of real capsomers involves the 
participation of a relatively large number of interaction- 
site pairs across a mutual contact surface. Although such 
interactions are not individually directional, in view of 
the substantial size of the capsomer compared to the ef- 
fective interaction range, it is unlikely that the interac- 
tions are capable of producing strongly bound states in 
which capsomers are incorrectly aligned. This is not nec- 
essarily true for simplified models with few interaction 
sites and an overall capsomer size similar to the range 
of attraction, where small clusters can become trapped 
in states corresponding to spurious local energy minima 
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instead of developing into complete shells. 

One method of avoiding this situation was introduced 
for small capsomer models and employs rules governing 
the interactions ( Section II V B(l in addition to permanent 
bonds 9]. The alternative is to use an enlarged cap- 
somer with more interaction sites and reversible bonds; 
the computational cost is increased, but the enhanced 
structural rigidity reduces the opportunity for incorrect 
bonding. These design alternatives demonstrate possible 
approaches to the problem. They also reflect a changing 
perspective brought about by the fact that the available 
computer resources continued to increase throughout the 
study; greater computing power allows better capsomer 
representations, larger systems and longer runs, all con- 
tributing to the eventual shift from permanent to re- 
versible bonding. 

B. Capsomer structure 

The simplified trapezoidal capsomer representation 
employs a rigid assembly of several soft (more precisely, 
almost hard) spherical particles arranged in a partly over- 
lapping configuration approximating the desired shape; 
specific designs are described in Section III CI and illus- 
trated in Figures ECU below. The repulsive force be- 
tween spheres used to prevent significant spatial overlap 
is based on the truncated Lennard- Jones potential 

f 4e[(cr/r) 12 - (cr/r) 6 + 1/4] r < r c 



where r is the sphere separation, a approximates the 
sphere diameter, e determines the energy scale, and 
r c = 2 1 / 6 er is the interaction cutoff; in the reduced units 
used subsequently, a = e = 1. A suitably arranged set of 
spheres provides an adequate approximation to the de- 
sired shape, but as more spheres are used the computa- 
tional effort required to evaluate the interactions between 
nearby molecules grows. The description based on pair 
interactions alone is, however, much simpler than alter- 
native shape representations that require evaluating the 
overlap of complex rigid bodies. 

Capsid size limits the amount of genetic material that 
can be stored. Capsids consisting of 60 capsomer units 
exist, but their internal volumes are insufficient for sub- 
stantial amounts of nucleic acid; shells are generally 
larger, consisting of multiples of 60, i.e., 60 T, units, 
where the triangulation number T can have values from 
1 to beyond 25. All sizes share the icosahedral symme- 
try, but since 60 is the maximum size under conditions 
of complete equivalence, the concept of quasi-cquivalence 
is invoked to explain how larger structures can be con- 
structed @, Q, 0| . Inspiration for this generalization 
came from the geometrical principles developed by Buck- 
minster Fuller for geodesic domes, and construction en- 
tails arranging 12 pentamers and certain specific num- 
bers of hexamers in a symmetric manner to produce a 



close approximation of a sphere that retains the 60-fold 
symmetry. If identical capsomer proteins are used, but a 
small amount of conformational deformation permitted 
(remaining within acceptable variations for bond lengths 
and angles) to achieve a minimum free energy structure, 
the same design principles lead to a general way of con- 
structing shells, with the ubiquitous icosahedral symme- 
try as a necessary consequence. 

Viewed from a simplified geometric perspective, hex- 
amers are planar oligomers constructed out of six trian- 
gular triplets, each consisting of three trapezoidal cap- 
somers. A nonplanar pentamer can be formed by re- 
moving one triangle from the hexamer and closing the 
gap. In the assembly of a polyhedral shell, the natural 
tendency to grow hexagons at certain positions is over- 
come by the more global free-energy benefits of form- 
ing a pentagon. Molecular 'switches', a conformation- 
modifying mechanism known as autostery, regulate for- 
mation of hexamers or pentamers, while ensuring these 
different subassemblies are positioned appropriately in 
the surface lattice. Without this conformational poly- 
morphism, assembly would produce either flat hexamer 
sheets or T=l polyhedra formed entirely of pentamers. 
In the case of quasi-equi valence, autostery represents an 
important characteristic of the process as it allows oth- 
erwise identical capsomers to occupy spatially nonequiv- 
alent locations in the shell. (Quasi-equi valence is only a 
general design consideration, and while appropriate for 
some viruses, may offer only a partial explanation for 
others; a full understanding of any given capsid structure 
requires analysis of the bonding energies of the capsomers 
themselves.) 

Implicit in the model design is the fixed shape. This 
issue is avoided when studying T=3 assembly by defin- 
ing three slightly different capsomer shapes (Section !!! C|) 
destined to occupy different classes of locations within 
the shell. Explicit inclusion of an autosteric mechanism 
to provide the conformational changes required by quasi- 
equivalence 0] - mechanical analogs of such processes 
are discussed in |17| - would complicate the models. 

The capsomer bonding forces are associated with in- 
teraction sites suitably positioned within the simplified 
structures, as shown in Figures Q-ISl below; specific pairs 
of sites interact whenever they approach to within a given 
range, and the closed-shell configuration corresponds to 
the minimum-energy state in which adjoining capsomers 
are correctly oriented; bond interactions are discussed in 
Section IIIII In a properly implemented design the only 
structures that should be capable of self-assembly are 
partial or complete shells, with energetic considerations 
excluding an enormous variety of 'mutant' structures. 



C. Specific designs 

The smaller of the two cases considered here is a cap- 
sid shell of 60 identical trapezoidal c apso mers. The shell 
can be regarded as an icosahedron ,18| each of whose 



20 equilateral triangular faces is subdivided into three 
coplanar trapezoidal units representing the capsomers of 
a T=l virus, as shown in Figure 0] below. (Earlier work 
dealt with the simpler task of assembling 60-faced 
pentakisdodecahedra from triangular units.) The lateral 
capsomer faces within the triangle are normal to the tri- 
angular plane, whereas those along the outside of the 
triangle are inclined at 20.905° to the normal, resulting 
in a dihedral angle of 138.190°. Successful assembly is 
conditional upon correct relative dimensions to ensure 
components fit together, and angles consistent with the 
overall shell curvature. 

The larger case corresponds to a T=3 virus. The cap- 
sid is based on a rhombic triacontahedron with 30 
identical rhombic faces; each face is subdivided into two 
isosceles triangles (the base angles are 58.283°, so the tri- 
angles are almost equilateral), and each of these triangles 
is then divided into three coplanar trapezoidal capsomers 
yielding a total of 180; the assembled shell appears in 
Figure El below. The lateral faces within the same trian- 
gle, and between the triangles comprising the rhombus, 
are normal to the triangular plane, while the other faces 
are inclined at 18°, producing a dihedral angle of 144°. 
As discussed in Section III Bl the use of three capsomer 
variants with slightly different face angles (and attrac- 
tive interactions only between corresponding face pairs) 
avoids the quasi-equivalence issue. 

Figure ^ shows the trapezoidal capsomer used in the 
T=l permanent bond studies. The larger, slightly over- 
lapping spheres that provide the overall shape occupy a 
single plane, while the interaction sites, represented by 
small spheres for visual convenience, determine the lo- 
cations and orientations of the lateral faces. Each bond 
requires the coupling of three complementary pairs of in- 
teraction sites. The edges have between one and three 
interior spheres, and the triangular sets of bonding sites 
extend well beyond the volume of the capsomer spheres 
(pointing in the direction of the shell interior when the 
capsomer is correctly positioned). Two representations 
are shown; one in terms of the spheres and interaction 
sites, the other a block approximating the overall shape; 
this simplified model should be contrasted with real cap- 
somers [lj consisting of long, folded proteins whose ex- 
posed surfaces form relatively complex landscapes. 

The corresponding T=3 version is shown in Figure |21 
Since the three capsomer types differ only slightly, just 
one is shown, in different orientations. It has a smaller 
planar area than for T=l in order to allow smaller shells 
relative to the size of the simulation region, but consists 
of two touching layers of spheres providing greater depth 
that helps avoid unwanted interactions. The spheres 
overlap within each layer, and this is varied to adjust 
the lateral face slopes. 

The final capsomer form, appearing in Figure^ is used 
for the reversible bond case with T=l. The shape is pro- 
duced using three layers of spheres to reduce even further 
the likelihood of incorrect bonding; this is now a more im- 
portant issue since there are no rules (see Section TlV Bp 
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FIG. 1: (Color online) Capsomer model used for T=l sim- 
ulation with permanent bonding; the spheres and interaction 
sites (small spheres) comprising the capsomer and the effec- 
tive trapezoidal shape are shown. 




FIG. 2: (Color online) Views of T=3 capsomer model. 



to help avoid interactions that do not contribute to the 
final shell. Sphere spacing within layers varies from over- 
lapping to well-separated. Each bond now involves four 
pairs of interaction sites (more closely spaced than be- 
fore); the energetic gain of a correctly aligned state is 
enhanced by distributing the interactions over more site 
pairs. Because of the increased capsomer thickness (three 
sphere layers rather than one or two) interaction sites 
can be positioned without extending beyond the actual 
area of the lateral faces, and the resulting steric screening 
helps prevent unwanted interactions. 

Figures ^ and show complete shells that these cap- 
somers are capable of forming. (The actual size of these 
shells can be deduced from the ratio of icosahedron edge 
length to radius, namely \/Z— 1 = 1.236 . . ..) The former 
shows a T=l shell; to make the bond locations visible the 









FIG. 3: (Color online) Capsomer model used for reversible bonding (T=l); different views of the sphere-based structure and 
its effective shape are shown. 




FIG. 4: (Color online) Complete T=l shell (as produced 
by the simulation) with 60 capsomers; capsomers are shown 
reduced in size so that bonds are visible. 




FIG. 5: (Color online) Complete T=3 shell shell with 180 
capsomers of three (color-coded) types. 



III. BOND INTERACTIONS 



capsomers have been reduced slightly in size. The latter 
shows a T=3 shell formed from 180 trapezoidal units of 
the three different types; here capsomers are drawn to 
show their effective sizes hiding the bonds. Due to the 
nature of the bonding forces - see Eq. J2J) - mutually at- 
tracting interaction sites are spatially coincident in the 
ground state. 



The designs leave ample scope for enhancement. More 
complex capsomer surfaces, for example, would allow the 
introduction of a 'lock-and-key' mechanism due to steric 
effects. An attempt was made to use such a technique in 
early work on triangular units, by adding an extra sphere 
to the center of a lateral face and leaving an opening 
in the complementary face, but this mechanism did not 
provide the desired additional rigidity; it was not tested 
with larger units however, since the increased size and 
multiple interaction sites accomplish the same goal of 
restricting internal degrees of freedom. 



Two fully-bonded capsomers are held together by in- 
teractions between sets of either three or four pairs of 
complementary interaction sites that can only interact 
with one another. The use of multiple sites, in addition 
to being more realistic, helps accomplish several goals: 
(a) The orientation of a lateral face and, consequently, 
the dihedral angle between capsomers, is specified by the 
plane containing the sites, (b) The overall binding inter- 
action is distributed over the contact face; this ensures 
that the total binding energy of misaligned capsomers 
can only be a fraction of the ground-state value, thereby 
lessening the stability of the bond, (c) Finally, multiple 
binding sites enhance structural rigidity by suppressing 
internal modes such as twisting or flapping. Typically 
(though not always), two capsomers are drawn together 
initially by just one of the interaction site pairs, and they 
then reorient so the remaining site pairs can participate. 

As shown in Figures HUH each of the three short cap- 
somer faces contains a single set of interaction sites, while 
the long face contains two sets. The labeling scheme used 
for the sets appears in Figure[S] Since color coding is use- 
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ful for T=3, capsomers are labeled B, G, R (blue, green, 
red) following the convention used in For T=l the 
trimer forms an equilateral triangle (as shown), whereas 
for T=3 a small change of apex angle makes the triangle 
isosceles (Section IllCj) . The five interaction sites follow 
the same bonding pattern in each capsomer, namely sites 
2 and 3 can bond, as can 4 and 5, while sites labeled 1 
bond with each other. Also shown are the site pairings 
associated with trimer, dimer, and pentamer or hexamer 
formation. Three capsomers joined by 2-3 bonds pro- 
duce a planar triangular trimer. A single 1-1 bond forms 
a dimer; it is nonplanar for T=l, while for T=3 it is pla- 
nar when two type G capsomers are involved and nonpla- 
nar in the alternative R-B case. The 4-5 bonds produce a 
nonplanar, flower-like pentamer for T=l, while for T=3, 
if all capsomers are of type B they produce a pentamer, 
or a hexamer if alternating R and G types are involved 
(in three coplanar pairs). 

The functional form of the attractive potential between 
interaction sites is a negative power of the site separation 
r, provided the sites are not too close, but for r < it 
takes the form of a narrow harmonic well; this form is 
chosen for convenience, and when the system is at rest 
r = 0. In the case of reversible bonding, the force is 
derived from the potential 

f e(l/r 2 a + r 2 /4 - 2/4) r < r h 

[ e(l/r* - l/r z ) r h < r < r a 

with typical parameter values e = 0.1, r/j = 0.3, and 
cutoff r a = 2. The interactions used with permanent 
bonding are similar, although details differ slightly; ear- 
lier work also included an explicit torsional interaction 
to accelerate the bonding process, but this was discarded 
once it became apparent that pair interactions alone were 
sufficient. 

At this juncture the two techniques diverge. In the 
approach used initially, bond formation is regarded as 
irreversible, while the alternative is to treat bonds sim- 
ply as potential wells of finite depth; these represent, re- 
spectively, the extremes of kinetically limited and equi- 
librium assembly. Physical justification for permanent 




FIG. 7: (Color online) Example of mutant capsid structure 
(permanent bonding); capsomer size is reduced, as in Fig- 
ure 21 to show the bonds (yellow particles are fully bonded, 
blue are not). 



bonds stems from possible conformational changes, and 
even cleavage, experienced by proteins in the course of 
bonding. Formation of a permanent bond between two 
interaction sites can be implemented using Eq. J5J; when 
the site separation initially falls below the potential 
is replaced by the harmonic term alone, irrespective of 
r, producing an infinitely high barrier from which es- 
cape is impossible. Implementation requires monitoring 
the identities of interacting site pairs, but this enables 
another feature, namely, that once a pair of sites have 
bonded permanently they then attract only each other, 
thereby reducing a tendency to form amorphous globules. 

The rapid assembly associated with permanent bond- 
ing implies a more promising approach than bonds sub- 
ject to breakage, but exploration reveals certain less de- 
sirable features. Stretchable bonds mean that partially 
formed assemblies are subject to structural distortions 
sufficiently large for bonds to appear between capsomers 
that, in more rigid assemblies, would be beyond bonding 
range. This leads to defects that not only prevent full 
assembly but also spawn mutant structures. Figure 
shows a rare defective assembly in which growth of a sec- 
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ond outer layer begins at a misaligned shell element. 

This situation is avoidable by making bonds nonper- 
manent, allowing energetic considerations to inhibit long- 
term survival of incorrect bond pairings. The severity of 
the problem can also be reduced for permanent bonds, 
but this requires the introduction of additional features 
- assembly or interaction 'rules' - some physically mo- 
tivated, others arbitrary; since similar issues are likely 
to appear in other kinds of self-assembly simulations, a 
brief discussion of the rule-based approach appears in 
Section HVBl 



IV. ASSEMBLY SCENARIOS 

A. Pathways 

Polyhedral shells can assemble in many ways, even af- 
ter allowing for the icosahedral symmetry. The presence 
of systematic features in the construction process is an 
important issue, since the likelihood of two arbitrary sub- 
assemblies being able to mesh successfully is low (unless 
incompatible pieces can be discarded in the process), but 
whether preferred assembly pathways exist is unknown. 
One hypothetical assembly scenario is based on a multi- 
stage process, with small clusters of specified shape form- 
ing initially, and then combining into increasingly larger 
subassemblies. These small clusters must be able to 'tile' 
the full shell, so a possible first stage is the assembly of 
triangles from the trapezoidal capsomers, while in the 
second stage these trimers are added (one at a time) 
to build full shells (combining subassemblies each con- 
sisting of several trimers is again an event with a low 
success rate). Experimental signatures of multistage as- 
sembly would be certain preferred intermediate cluster 
sizes, or the more accessible rate concentration depen- 
dence |l9l l2(ij| . In simulations that involve permanent 
bonding, assembly order can be enforced by using bond- 
ing rules, as described in Section FtV Bl 

In the case of reversible bonding, where bond breakage 
occurs as a consequence of thermal fluctuations, assembly 
order can be biased by energetic preferences; in order to 
encourage early dimcr or trimcr formation, larger force 
constants would be associated with the relevant bonds 
(here, the strength e is doubled). Left unattended, the 
system evolves to a state with many partial shells, from 
which further growth is impossible; to ensure there is al- 
ways an adequate supply of free capsomers, partial shells 
below a certain threshold size are broken up at regular 
intervals (how often depends on growth rate) by switch- 
ing off their bonding interactions for a short period. (A 
possible modification useful for quantitative work entails 
breaking up newly completed shells as well, to increase 
the number of shell- growth histories available for study.) 



B. Interaction rules 

Permanent bonds do not allow construction errors to 
be rectified, so it is crucial to reduce any opportunity for 
incorrect bond formation; this is accomplished by intro- 
ducing rules governing when attractive forces may act. 
Analogous rules could appear in real molecules through 
local conformational variation in response to changes in 
bonding state, although their existence is not readily es- 
tablished; here, rules are introduced to compensate for 
other design simplifications. As a historical aside, the 
simulations started with small polyhedra, ranging from 
tetrahedra to 32-faced 'soccer balls', with interaction 
rules playing a prominent role; rule complexity increased 
for larger polyhedra with dubious chemical justification, 
so the rule-based approach with permanent bonding was 
eventually supplanted by larger capsomers that allow 
greater freedom in positioning interaction sites and re- 
versible bonding to reduce construction errors. 

The most basic of the rules aims to avoid bond forma- 
tion in ways inconsistent with the final structure: Over 
the time interval starting when one of a capsomer's in- 
teraction sites bonds with the complementary site on an- 
other, and ending when all other sites in the set are 
joined, these capsomers cannot form other bonds. If 
construction follows a pathway in which, for example, 
trimers initially form, which then bond into larger struc- 
tures, an analogous rule applies to entire trimers. In 
order to minimize any adverse effects (but without at- 
tempting to mimic nature which deals with much larger 
systems and is not necessarily reliant on high yields), 
if bonding fails to complete within a prescribed interval 
the existing partial bond is broken; to prevent immedi- 
ate rebonding newly separated capsomers must wait for 
a certain time before they can begin bonding again. Such 
rules help ensure the release of units that cannot bond 
completely, as in the case of two capsomers attempting 
to bond simultaneously along different edges of a shell 
opening big enough for only one. 

Other interaction rules enforce assembly pathways 
( Section IIV Afl : thus if growth occurs via trimer inter- 
mediates, capsomers are first required to form trimers, 
and only when all internal bonds are complete can these 
clusters associate into larger structures. Additionally, by 
restricting the number of larger subassemblies that can 
nucleate by the joining of (ej*.) two trimers - equivalent 
to a rate-limiting process |l9| - it is possible to ensure 
a significant yield of complete shells rather than numer- 
ous fragments. Typically, this limit would be set so that 
50-75% of the trimers are used by the full shells, while 
the rest supply an adequate background concentration of 
construction material; a similar restriction is applied to 
the formation of the trimers. 

Because of the intrinsic bond flexibility, the need for 
further rules only becomes apparent as incorrect struc- 
tures are encountered; two examples will be mentioned 
here. When the pathway involves dimers, the appearance 
of unwanted bonds is reduced by requiring that after a 
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1-1 bond creates a dimer, the next must be a 2-3 bond 
between one of the dimer members and a capsomer in 
a growing shell, and only then the remaining 4-5 bond; 
since the 2-3 bond joins sites closer to the dimer cen- 
ter than the 4-5 bond, this restricts partially bonded 
units from encountering inappropriate bonding partners. 
For trimer assembly, the 2-3 bonds are used to build the 
trimer, the 1-1 bond may then join the trimer to a shell 
and, last of all, the more distant (from the trimer center) 
4-5 bonds are allowed to form. Such constrained bond- 
ing sequences might be attributable to conformational 
changes as bonding progresses. 



V. SIMULATION TECHNIQUES 

General MD methodology is discussed in |6(; here a 
brief summary of the issues relevant to capsomer sim- 
ulation will suffice. Interaction calculations are carried 
out efficiently using neighbor lists; list construction fol- 
lows the procedure used for monatomic fluids. Separate 
lists are used for short-range repulsive forces between the 
spheres giving capsomers their shape, Eq. (I}, and for 
longer-range attractive forces between interaction sites, 
Eq. J2J. The rotational equations of motion employ stan- 
dard rigid-body methods; these, together with the trans- 
lational equations, are solved using a leapfrog integrator. 
The simulation region is bounded by elastically reflect- 
ing hard walls, implemented using short-range repulsive 
forces based on Eq. Q acting normal to the surfaces; 
since visualization is important, hard walls avoid poten- 
tially confusing imagery accompanying periodic bound- 
aries. In the initial state, units are positioned on a lat- 
tice and assigned random orientations and velocities; lat- 
tice spacing determines the mean density of the capsomer 
'fluid'. Simulations based on reversible bonding - which 
avoid the complexity associated with interaction rules - 
can be run on a distributed memory (message passing) 
parallel computer for improved performance. 

Exothermal bond formation gradually heats the sys- 
tem; this problem is particularly acute due to the lim- 
ited number of degrees of freedom capable of absorbing 
the excess energy (the reason being the absence of sol- 
vent and the rigid capsomer structure) . Applying a weak 
damping force — j(v ■ r)r/r 2 along each bond resolves 
this issue, where v is the relative velocity of the interac- 
tion sites and the damping coefficient 7 = 0.1. Use of 
constant-temperature MD ensures that the overall tem- 
perature does not change despite bond formation and 
damping; the net effect is to transfer energy associated 
with internal vibration to the motion of entire clusters. 

The interaction parameters are chosen for efficient self- 
assembly while maintaining numerical stability; there is 
presently no relation to experimental association energies 
[16j. The number density affects the outcome and must 
also be established empirically: Too high a value will 
not provide adequate space for shells to grow without 
mutual interference; a density that is too low will retard 



growth due to capsomers lying beyond their attraction 
range and the lack of collisions that can, in the case of 
reversible bonding, help break off incorrectly bound units 
from partially constructed shells (the densities actually 
used are substantially higher than in experiment). 

In the case of permanent bonding, the T=l simula- 
tions employ 1000 capsomers, with 13 shells allowed to 
nucleate and run lengths of 2-300,000 steps; for the larger 
T=3 shells the system contains 4096 capsomers, 10 shells 
can nucleate and run lengths are 5-800,000 steps. Bonds 
are allocated 4000 steps to form, and if unsuccessful the 
participating sites have their attractive forces turned off 
over the next 1000. 

The reversible bonding runs for T=l follow 4096 cap- 
somers over 10 million steps, with no limit on shell nu- 
cleation. Clusters with size < 30 are broken up every 
500,000 steps by turning off their attractive interactions 
over the subsequent 10,000 steps. The existence of a bond 
(insofar as cluster measurements in Section I VII are con- 
cerned) depends on the separation of either a single pair 
of interaction sites or - the more stringent requirement 
on the separations of all (either three or four) pairs asso- 
ciated with the bond; in either case the interaction sites 
must approach to within a distance 0.2 (< r/i) to be con- 
sidered paired. Finally, in reduced MD units, capsomers 
have unit mass, the integration step is 0.005, the temper- 
ature unity, and the number density typically 0.004. 



VI. RESULTS 
A. Shell analysis 

The simulation results are both quantitative and qual- 
itative in nature; a 'snapshot' sequence recorded at inter- 
vals of 2000 steps over the course of each simulation run 
provides the data needed to recreate capsid growth for 
post-analysis. A run can require several days of comput- 
ing on a powerful workstation, but subsequent processing 
requires only minimal effort. Shell properties are read- 
ily measured, allowing the mean growth statistics to be 
analyzed, together with the behavior of individual shells 
(time resolution is limited by the snapshot interval, so 
shortlived bonds between snapshots are missed). Ani- 
mated sequences providing a condensed summary of the 
run, in full three-dimensional detail, are also available, 
although static images will have to suffice here. 

Establishing shell completeness requires (a) identify- 
ing a bonded set of capsomers using cluster analysis [fj 
with the appropriate bonding criterion, (b) ensuring their 
number equals the expected shell size, and finally (c) 
checking that each capsomer is bound to the correct num- 
ber of neighbors; if all these tests succeed then, in view of 
the comparative rigidity of the bonds, the cluster corre- 
sponds to a closed shell. It is somewhat arbitrary whether 
all site pairs, or just a single pair, must be within range; 
the two criteria tend to complement one another. The 
former, more stringent criterion produces smaller clus- 
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ters during early growth; this is helpful when trying to 
visualize the emergence of partial shells from the cap- 
somer 'soup', but omits the local environment to which 
a growing cluster might be be loosely attached. The 
latter produces larger clusters, including big, ill-defined 
structures that incorporate multiple partial shells. Dur- 
ing later stages of assembly, both criteria lead to similar 
results, since as shells near completion correct capsomer 
positioning allows all sites to participate in bonding. 

Describing the nature of incomplete shells, while 
straightforward when the imagery is available (a nearly 
complete shell is readily characterized, as is a shell with a 
localized defect), is not obviously quantifiable; since par- 
tially formed structures, even defect-free shell fragments, 
have a variety of morphologies, mechanizing their classi- 
fication is nontrivial. Each such structure can be repre- 
sented as a bonded network (or graph), and while it is 
possible to determine the topology by evaluating connec- 
tivity, and the compactness by counting missing bonds, 
it is not apparent how such information can be utilized. 
Furthermore, development is not necessarily a process 
whereby shells grow monotonically by accretion of indi- 
vidual capsomers or small subassemblies, since it is also 
possible for larger subassemblies to aggregate and (with 
reversible bonding) for groups forming partial shells to 
break away from larger structures. 



B. Growth 

The most important observation regarding the overall 
behavior, equally applicable to both permanent and re- 
versible bonding scenarios, is that polyhedral shells have 
little difficulty growing to completion, and mutant struc- 
tures are highly unlikely. Partial shells tend to have few 
voids in their surfaces, and shells nearing completion typ- 
ically have only one or two holes; more open cagelike 
structures with multiple lacunae are not encountered. 
The results also reveal, not surprisingly, that an inap- 
propriate choice of interactions (or, for that matter, even 
a slight error in defining capsomer geometry) leads to 
a wide variety of alternative structures, including open 
networks, incorrectly linked assemblies of shell fragments 
and amorphous shapes that defy characterization. Due 
to the difficulty of describing anything other than a cor- 
rect shell, this aspect of the subject is avoided, but real 
viruses experience analogous effects in unfavorable envi- 
ronments. 

The main emphasis of the analysis is on reversible 
bonding, the likely focus of future work, but a few key re- 
sults for the permanent bonding alternative are presented 
first in order to demonstrate its capability. Figure|Hlsum- 
marizes the number of complete shells as a function of 
time for several different cases, namely T=l shells grown 
using monomer and trimer pathways, and T=3 shells us- 
ing dimer and trimer pathways, with two examples of 
the former to show the variation between runs. Figure 
shows several snapshots of T=3 growth using a trimer 
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FIG. 8: Number of complete shells vs time for permanent 
bonding. 

pathway; shells are isolated from their milieu using au- 
tomated cluster analysis. 

Figure I1UI summarizes the yield of complete T=l 
shells vs time for simulations employing reversible bond- 
ing, three runs each for unweighted, dimer- and trimer- 
weighted pathways; note that the runs are now an order 
of magnitude longer. Dimer growth appears to produce 
the highest yields at intermediate times, but any observa- 
tion of this kind is tentative because the spread in trimer 
results over the runs suggests more sampling is required 
and also because the effectiveness of selectively increased 
interaction strengths in biasing the pathways has yet to 
be established. Growth snapshots using trimer weighting 
appear in Figure ITT1 

C. Shell statistics 

Despite the difficulty in quantifying shell growth, av- 
erage properties of partial shells can be used in studying 
growth trends. Capsomers belonging to incomplete shells 
can be categorized according to the degree of bonding; a 
higher number with the maximal five bonds is consistent 
with a more compact hole-free partial shell, whereas an 
increased number with just one or two bonds is an indi- 
cation that at least part of the structure is loosely con- 
nected. FigureslT^land[T3lshow the capsomer fractions as 
functions of shell size for unweighted and trimer- weighted 
assembly; averaging is over the entire run and the stricter 
(all-pairs) definition of bonding is used. Two differences 
are apparent; in the unweighted case there is a larger pro- 
portion of singly bonded units in the small (size < 20) 
clusters, while in the trimer case there is an enhanced 
preference for five bonds over four in the size range 30-50 
(possibly reflecting a drop in monomer-sized holes in the 
shell) ; both of these are indicators that trimer weighting 
leads to more compact intermediate states (dimer results 
are similar). 

Figure fHI shows how the size distribution develops with 
time; prominent features are the steady trend to comple- 
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FIG. 9: (Color online) Snapshots from a T=3 simulation with permanent bonding: the images show an early state (including 
the container for reference); three views at different times showing only the growing shells; the entire system corresponding to 
the third of these views; the final state (capsomers are colored by type). 

tion by size and then graphs the size history of the largest 
of these clusters, the 5th largest, and so on, until the 
count reaches the number of shells that grow to comple- 
tion; no attempt is made to ensure 'continuity' by track- 
ing specific clusters. The results, using the strict all-pairs 
bond definition, appear in Figure 1151 Figure 1161 shows 
the corresponding results when only a single pairing is 
required, leading to larger clusters early in the run. The 
long-term behavior in both cases is the same (graphs ter- 
minate when shells are complete). The periodic breakup 
affects small clusters until their size exceeds the thresh- 
old; there is a secondary effect on larger clusters since 
newly freed capsomers provide competition as bonding 
partners. 

The second kind of analysis tracks, as closely as possi- 
ble, the development of particular shells. This is accom- 
plished by first identifying all complete shells in the sys- 
tem at the end of the run, together with their constituent 
units, and then using this information while following the 
construction history of each shell, ft is optional whether 
to include capsomers whose shell membership is tran- 
sient, an effect that appears early in assembly, but less 
so later on when a high degree of bonding makes escape 
more difficult. Ambiguity can arise as to a shell's true 
ancestry; where there are several contributing precursor 
assemblies, it is the one most heavily represented in the 
final shell that is credited. 

Figure El shows typical shell growth histories for a 
trimer-weighted pathway. The selected shells demon- 
strate different growth characteristics; they include both 
the fastest shell to complete, and one of slowest, taking 
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FIG. 10: Number of complete shells vs time for reversible 
bonding. 



tion, a relatively narrow distribution once growth is well 
underway, and the imposed cluster breakup. The fact 
that for most of the run there are few clusters of in- 
termediate size is a consequence of breakup followed by- 
prompt attachment of newly freed units to larger assem- 
blies. At the end of the run the system consists entirely 
of shells that are either complete, or nearly so, together 
with monomers and small fragments. 

Detailed information extending beyond such system- 
wide averages is obtained by examining individual shell 
growth; two forms of analysis are demonstrated here. 
The first ranks the clusters in each snapshot configura- 
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FIG. 11: (Color online) Snapshots from a simulation with reversible bonding: early state; two views after 0.5 x 10 steps 
showing the full system and the 132 clusters of size > 10 (color coding - red particles have all five bonds in place, blue have 
< 5); the 34 clusters of size > 50 after 2.5 x 10 6 steps; two views of the final state after 8 x 10 6 steps showing the entire system, 
and just the 39 complete shells with container. 
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FIG. 12: Capsomer fractions with different bond counts vs 
cluster size for reversible bonding and unweighted assembly. 



FIG. 13: As in Figure fl~2l for trimer- weighted assembly. 



some three times longer, although in most cases a delay 
beyond about two million steps is caused by a wait for 
the final few capsomers (possibly just one) to bond. The 
strict bond definition is used and histories end upon shell 
completion; the almost monotonically increasing solid 
graphs show how many members of the final shell have 
already joined, while the more variable dashed graphs in- 
clude transient members that eventually break away from 
the shell. 

Visualization is extremely useful for probing the de- 
tails of the behavior; unfortunately its rich message can- 



not be transferred to the printed page. Suitable color 
coding, based on known final shell membership, allows 
the identification of units destined to join a particular 
shell, as well as transient members. Capsomers, liter- 
ally, can come and go; only when a capsomer is bound 
to most of its neighbors, and embedded in a substantial 
shell fraction, is it unlikely to be knocked out of posi- 
tion. Population exchanges of this kind are not readily 
characterized in a quantitative manner. 
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FIG. 14: Cluster size distribution vs time for the trimer- 
weighted case. 
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2000 4000 6000 8000 10000 

step (xlOOO) 

FIG. 15: Ranked cluster sizes for trimer-weighted assembly 
(only every 5th cluster is included); all site pairs are required 
for bonding. 



VII. OTHER SELF-ASSEMBLY EXAMPLES 

Although the focus of the paper is on polyhedral shell 
growth, with the goal of modeling capsid formation, the 
same approach works for other self-assembling systems. 
Two examples of this kind will be mentioned briefly, the 
first broadly related to micelles, the second a unique and 
somewhat improbable structure that solves a well-known 
assembly puzzle. While the former consists of large num- 
ber of identical components, the latter has highly spe- 
cific interactions between a small set of distinct com- 
ponents. The motivation underlying these examples is 
to show what can be accomplished in the simplest of 
self-assembly simulations. Related 'analog' simulations 
have been performed in the laboratory, using millimeter- 
size plastic objects in solution with appropriate adhesive- 
coated surfaces [2lj . 

The first of the systems is a fluid of rigid parti- 
cles (micelle simulations [2^ generally use flexible chain 
molecules) each having the form of a tapered cylinder 
made from a linear array of spheres of decreasing size. 
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FIG. 16: As in Figure H31 but bonding requires only a single 
site pair. 

Lennard-Jones interactions occur between spheres occu- 
pying equivalent locations in different particles. The final 
state of an 8000-particle simulation under conditions of 
gradually reducing temperature (a way of encouraging 
collapse into the ground state) is shown in Figure El it 
consists of 28 packed spherical clusters with sizes ranging 
from a maximum of 368 down to 260, one hemispher- 
ical cluster of size 183 (half the maximum size), and 
a few monomers and tiny clusters. The range of clus- 
ter sizes is determined by how many particles of a given 
shape can form a packed shell, with the absence of well- 
characterized bonding patterns accounting for the size 
variability, just the opposite of what occurs with polyhe- 
dral shells. 

The second system is an MD realization of the 'Soma 
cube' ■ This is an assembly puzzle consisting of seven 
distinct pieces, each formed from three or four unit cubes 
joined along their faces in various fixed configurations; 
the pieces can be assembled to form a cube of edge three 
in 240 different ways. For the simulation, the puzzle 
pieces are replaced by rigid bodies made of spheres lo- 
cated at the cube positions, with attractive forces be- 
tween spheres that are adjacent when all pieces fit to- 
gether in a particular solution of the puzzle. If the bodies 
are randomly placed in a large container (with reflect- 
ing walls as before), assigned random initial velocities, 
their dynamics simulated using MD, and subjected to 
a gradually falling temperature, will the puzzle solution 
emerge spontaneously? The answer is a qualified yes; in 
a substantial proportion of runs the cube assembles it- 
self. Figure E| shows the components and the assembled 
product, providing yet another example of how, if inter- 
actions are formulated properly, the outcome is correct 
self-assembly. 

VIII. CONCLUSION 

The approach described in this paper involves plac- 
ing simplified capsomer elements in a container and fol- 
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FIG. 17: Size histories of selected clusters that grow into complete shells (see Section fVIC^ . 




FIG. 18: (Color online) Tapered cylindrical particles that 
have condensed into spherical clusters. 



lowing their dynamics; the outcome is a demonstration 
that a simple potential energy function, based on struc- 
tural considerations, is essentially all that is required to 
drive the assembly of the corresponding polyhedral cap- 
sid shells. The surprising aspects of the results - given 
the absence of any a priori theoretical expectation - are 
the fast growth rates, high yields, and the avoidance of 
incorrect structures. Although the models are not rep- 
resentative of real molecules, if general principles under- 
lying capsid assembly do exist, simplified systems of this 
kind ought to embody their essence. 

The advantage of the simplified approach, once its va- 
lidity and relevance are confirmed, is that it allows ex- 
ploration of the salient features of the problem free of 
peripheral detail; the influence of shape and interactions 





FIG. 19: (Color online) Soma cube components and self- 
assembled puzzle solution. 



can be examined relatively easily, in contrast to the heavy 
computational demands of an explicit atomic represen- 
tation. While the qualitative benefits are indisputable, 
the approach is not intended for quantitative estimation, 
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where accurate molecular structure and interactions are 
likely to be important. This does not preclude basing 
systematic studies on simple models; it is possible, for 
example, to explore the effects of varying relative inter- 
action strengths to match known binding energies, or to 
introduce design modifications aimed at better approxi- 
mating major structural features of the capsomers. Such 
projects await future consideration. 
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